Large language models (LLMs) have been shown to be able to perform new tasks based on a few demonstrations or natural language instructions. While these capabilities have led to widespread adoption, most LLMs are developed by resource-rich organizations and are frequently kept from the public. As a step towards democratizing this powerful technology, we present BLOOM, a 176B-parameter open-access language model designed and built thanks to a collaboration of hundreds of researchers. BLOOM is a decoder-only Transformer language model that was trained on the ROOTS corpus, a dataset comprising hundreds of sources in 46 natural and 13 programming languages (59 in total). We find that BLOOM achieves competitive performance on a wide variety of benchmarks, with stronger results after undergoing multitask prompted finetuning. To facilitate future research and applications using LLMs, we publicly release our models and code under the Responsible AI License.
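For context, the released checkpoints can be loaded with the Hugging Face transformers library. The snippet below is a minimal usage sketch (not part of the paper); it uses the small bigscience/bloom-560m variant to keep the download manageable, but the same calls apply to the full bigscience/bloom model.

```python
# Minimal sketch: load an open-access BLOOM checkpoint and generate a continuation.
from transformers import AutoTokenizer, AutoModelForCausalLM

model_name = "bigscience/bloom-560m"  # small variant; the 176B model is "bigscience/bloom"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

inputs = tokenizer("BLOOM is an open-access language model that", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=30)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```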
Convolutional neural networks (CNNs) have proven highly effective for pulmonary nodule detection. However, existing CNN-based detection methods lack the ability to capture long-range dependencies, which are essential for extracting global information. Non-local operations have been widely used for this purpose in computer vision tasks, but for 3D computed tomography (CT) images their computational cost can be prohibitive. To address this issue, we propose a Long-Short Slice-Aware Network (LSSANet) for pulmonary nodule detection. In particular, we develop a new non-local mechanism termed Long-Short Slice Grouping (LSSG), which splits the compact non-local embedding into a short-distance slice-grouped one and a long-distance slice-grouped one. This not only relieves the computational burden but also preserves long-range dependencies among slices and across the whole feature map. The proposed LSSG is easy to use and can be plugged into many pulmonary nodule detection networks. To verify the performance of LSSANet, we compare it against several recently proposed, competitive detection methods based on 2D/3D CNNs. Promising evaluation results on the large-scale PN9 dataset demonstrate the effectiveness of our method. Code is available at https://github.com/ruixxxx/lssanet.
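To make the grouping idea concrete, the following is a rough, hypothetical PyTorch sketch of attention over slice-level features split into short-range (contiguous) and long-range (strided) groups; the module name, tensor shapes, and group size are illustrative assumptions, not the paper's exact LSSG design.

```python
# Sketch: grouped attention over the slice dimension of a 3D CT feature volume.
# Short-range groups hold adjacent slices; long-range groups hold strided slices,
# so cost stays bounded while slice-wise long-range dependencies are preserved.
import torch
import torch.nn as nn

class SliceGroupAttention(nn.Module):
    def __init__(self, channels, group_size=4):
        super().__init__()
        self.group_size = group_size
        self.qkv = nn.Linear(channels, channels * 3)
        self.proj = nn.Linear(channels, channels)

    def attend(self, x):  # x: (num_groups, group_len, C) -- attention within each group
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        attn = torch.softmax(q @ k.transpose(-2, -1) / q.shape[-1] ** 0.5, dim=-1)
        return self.proj(attn @ v)

    def forward(self, x):  # x: (B, D, C) slice features, D divisible by group_size
        B, D, C = x.shape
        g = self.group_size
        # short-range: contiguous groups of g adjacent slices
        short = self.attend(x.reshape(B * (D // g), g, C)).reshape(B, D, C)
        # long-range: strided groups, slices g apart land in the same group
        longr = x.reshape(B, D // g, g, C).transpose(1, 2).reshape(B * g, D // g, C)
        longr = self.attend(longr).reshape(B, g, D // g, C).transpose(1, 2).reshape(B, D, C)
        return x + short + longr

feat = torch.randn(2, 16, 64)                   # 2 volumes, 16 slices, 64 channels
print(SliceGroupAttention(64)(feat).shape)      # torch.Size([2, 16, 64])
```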
Upgrading existing transmission lines is a useful tool for combating transmission congestion and guaranteeing transmission security as demand grows and renewable energy expands. This study addresses the selection of lines whose capacity should be expanded from the perspective of the independent system operator (ISO), by considering transmission line constraints and generation-demand balance conditions, and by incorporating ramp-up and start-up ramp rates, shut-down ramp rates, ramp-down rate limits, and minimum down times. To this end, we develop an ISO unit commitment and economic dispatch model and cast it as a multi-parametric analysis of a mixed-integer linear programming (MILP) problem with right-hand-side uncertainty. We first relax the binary variables to continuous variables and apply the Lagrangian method and Karush-Kuhn-Tucker conditions to obtain the optimal solution (optimal decision variables and objective function) as well as the critical regions associated with active and inactive constraints. Furthermore, we extend the traditional branch-and-bound method to solve the large-scale MILP problem: at each node we determine an upper bound on the problem, then compare the gap between the upper and lower bounds, reaching an approximately optimal solution within the decision maker's tolerable error range. In addition, the first derivative of the objective function with respect to each line's parameter is used to inform the selection of lines for relieving congestion and maximizing social welfare. Finally, the amount of capacity upgrade is chosen by balancing the cost rate of the objective function against the line upgrade cost. Our findings are supported by numerical simulations and provide decision-making guidance for transmission line planning.
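As a toy illustration of the relax-then-bound step (not the paper's model), the sketch below relaxes the binary commitment variables of a two-generator, single-period problem to [0, 1], solves the LP relaxation with SciPy for a lower bound, and rounds the relaxed commitments to obtain a feasible upper bound; all cost and capacity numbers are made up.

```python
# Toy example: LP relaxation of a tiny unit-commitment MILP, then naive rounding,
# giving the lower/upper bounds compared in a single branch-and-bound step.
from scipy.optimize import linprog

demand = 120.0
pmax = [80.0, 100.0]        # generator capacities (MW)
cost = [20.0, 35.0]         # energy costs ($/MWh)
start = [300.0, 100.0]      # commitment (start-up) costs ($)

# variables: [p1, p2, u1, u2]; minimize energy + start-up cost
c = cost + start
A_eq, b_eq = [[1, 1, 0, 0]], [demand]                 # p1 + p2 = demand
A_ub = [[1, 0, -pmax[0], 0], [0, 1, 0, -pmax[1]]]     # p_i <= pmax_i * u_i
b_ub = [0, 0]
bounds = [(0, None), (0, None), (0, 1), (0, 1)]       # u relaxed to [0, 1]

relax = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
lower_bound = relax.fun
u_rounded = [1 if u > 1e-6 else 0 for u in relax.x[2:]]  # naive rounding to a feasible commitment
upper_bound = sum(ci * pi for ci, pi in zip(cost, relax.x[:2])) + sum(
    si * ui for si, ui in zip(start, u_rounded))
print(f"lower bound: {lower_bound:.1f}, upper bound: {upper_bound:.1f}")
```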
Cancer prognosis on gigapixel whole slide images (WSIs) has long been a challenging task. Most existing methods focus only on single-resolution images. Multi-resolution schemes that exploit image pyramids to enhance WSI visual representations have not received sufficient attention. To explore a multi-resolution solution for improving cancer prognosis accuracy, this paper proposes a dual-stream architecture that models WSIs via an image pyramid strategy. The architecture consists of two sub-streams: one for low-resolution WSIs and the other tailored for high-resolution WSIs. Compared with other approaches, our scheme has three highlights: (i) there is a one-to-one relationship between streams and resolutions; (ii) a square pooling layer is added to align the patches of the two resolution streams, greatly reducing computational cost and enabling natural feature fusion between streams; (iii) a cross-attention-based method is proposed to pool high-resolution patches spatially under the guidance of the low-resolution stream. We validate our scheme on three publicly available datasets with a total of 3,101 WSIs from 1,911 patients. Experimental results verify that (1) the hierarchical dual-stream representation is more effective for cancer prognosis than single-stream ones, with average C-index gains of 5.0% and 1.8% over the single low-resolution and high-resolution streams, respectively; (2) our dual-stream scheme outperforms current state-of-the-art schemes by an average of 5.1% in C-index; (3) cancers with observable survival differences may have different preferences for model complexity. Our scheme can serve as an alternative tool for further facilitating WSI prognosis research.
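A hypothetical sketch of the cross-attention idea is given below: low-resolution patch embeddings act as queries over high-resolution patch embeddings, so high-resolution detail is aggregated under low-resolution guidance. The single-head design, dimensions, and residual connection are assumptions rather than the paper's exact architecture.

```python
# Sketch: cross-attention between the two resolution streams of a WSI pyramid.
import torch
import torch.nn as nn

class ResolutionCrossAttention(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.q = nn.Linear(dim, dim)
        self.k = nn.Linear(dim, dim)
        self.v = nn.Linear(dim, dim)

    def forward(self, low_res, high_res):
        # low_res: (B, N_low, dim) patch features from the low-resolution stream
        # high_res: (B, N_high, dim) patch features from the high-resolution stream
        q, k, v = self.q(low_res), self.k(high_res), self.v(high_res)
        attn = torch.softmax(q @ k.transpose(-2, -1) / q.shape[-1] ** 0.5, dim=-1)
        return low_res + attn @ v   # low-resolution tokens enriched with high-resolution detail

low = torch.randn(1, 64, 256)    # e.g. 64 low-resolution patches
high = torch.randn(1, 256, 256)  # e.g. 256 aligned high-resolution patches
print(ResolutionCrossAttention(256)(low, high).shape)  # torch.Size([1, 64, 256])
```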
Almost all statements in knowledge bases have a temporal scope during which they are valid. Hence, knowledge base completion (KBC) over temporal knowledge bases (TKBs), where each statement may be associated with a temporal scope, has attracted growing attention. Prior works assume that every statement in a TKB must be associated with a temporal scope. This ignores the fact that scope information is commonly missing in a KB. Consequently, prior work typically cannot handle the general use case where a TKB consists of temporal statements with and without known temporal scopes. To address this issue, we establish a new knowledge base embedding framework, called Time2Box, that can deal with atemporal and temporal statements of different types simultaneously. Our main insight is that answers to a temporal query always belong to a subset of the answers to its time-agnostic counterpart. In other words, time is a filter that helps pick out answers valid during certain periods. We introduce boxes to represent the set of answer entities to a time-agnostic query. The filtering functionality of time is modeled by intersections over these boxes. In addition, we generalize current evaluation protocols for time interval prediction. We describe experiments on two datasets and show that the proposed method outperforms state-of-the-art (SOTA) methods on both link prediction and time prediction.
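The box-as-filter intuition can be sketched as follows (the parameterization and scoring below are assumptions, not Time2Box's exact formulation): intersect an atemporal query box with a time-filter box, then score a candidate entity by how far it falls outside the intersection, so temporal answers are always a subset of the atemporal ones.

```python
# Sketch: axis-aligned box intersection as a temporal filter over answer sets.
import torch

def intersect(center1, offset1, center2, offset2):
    """Intersection of two boxes given as (center, non-negative offset)."""
    lo = torch.maximum(center1 - offset1, center2 - offset2)
    hi = torch.minimum(center1 + offset1, center2 + offset2)
    return (lo + hi) / 2, torch.clamp((hi - lo) / 2, min=0.0)

def outside_distance(entity, center, offset):
    """0 if the entity embedding lies inside the box, otherwise its L1 distance to it."""
    return torch.clamp(torch.abs(entity - center) - offset, min=0.0).sum(-1)

dim = 8
query_c, query_o = torch.zeros(dim), torch.ones(dim)           # atemporal query box
time_c, time_o = 0.5 * torch.ones(dim), 0.8 * torch.ones(dim)  # time-filter box
c, o = intersect(query_c, query_o, time_c, time_o)

inside = 0.6 * torch.ones(dim)   # entity valid in the queried period -> distance 0
outside = 3.0 * torch.ones(dim)  # entity answering only the atemporal query -> penalized
print(outside_distance(inside, c, o).item(), outside_distance(outside, c, o).item())
```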
A common need for artificial intelligence models in the broader geosciences is to represent and encode various types of spatial data, such as points (e.g., points of interest), polylines (e.g., trajectories), polygons (e.g., administrative regions), graphs (e.g., transportation networks), or rasters (e.g., remote sensing images), in a hidden embedding space so that they can be readily incorporated into deep learning models. A fundamental step is to encode a single point location into an embedding space such that the embedding is learning-friendly for downstream machine learning models such as support vector machines and neural networks. We call this process location encoding. However, there is a lack of a systematic review of the concept of location encoding, its potential applications, and the key challenges that need to be addressed. This paper aims to fill this gap. We first provide a formal definition of location encoding and discuss the necessity of location encoding research from a machine learning perspective. Next, we provide a comprehensive survey and discussion of the current landscape of location encoding research. We classify location encoding models into different categories according to their inputs and encoding methods, and compare them based on whether they are parametric, multi-scale, distance-preserving, and direction-aware. We show that existing location encoding models can be unified under a shared formulation framework. We also discuss applications of location encoding to different types of spatial data. Finally, we point out several challenges that need to be addressed in future research.
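As one concrete instance of the parametric, multi-scale encoders surveyed here, the sketch below maps a 2D coordinate onto sine/cosine waves at geometrically spaced wavelengths; the function name, scale schedule, and dimensionality are illustrative assumptions.

```python
# Sketch: multi-scale sinusoidal location encoding of a single 2D point.
import numpy as np

def encode_location(xy, num_scales=8, min_wavelength=1.0, max_wavelength=10_000.0):
    """Map (x, y) to a 4 * num_scales embedding: each coordinate is projected onto
    sin/cos waves whose wavelengths are geometrically spaced between min and max."""
    xy = np.asarray(xy, dtype=float)
    wavelengths = min_wavelength * (max_wavelength / min_wavelength) ** (
        np.arange(num_scales) / max(num_scales - 1, 1))
    feats = []
    for coord in xy:
        phase = 2 * np.pi * coord / wavelengths
        feats.append(np.sin(phase))
        feats.append(np.cos(phase))
    return np.concatenate(feats)

print(encode_location((1385.3, -742.8)).shape)  # (32,)
```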
The growing interest in intelligent services and privacy protection for mobile devices has given rise to the widespread application of federated learning in Multi-access Edge Computing (MEC). Diverse user behaviors call for personalized services with heterogeneous Machine Learning (ML) models on different devices. Federated Multi-task Learning (FMTL) is proposed to train related but personalized ML models for different devices, whereas previous works suffer from excessive communication overhead during training and neglect the model heterogeneity among devices in MEC. Introducing knowledge distillation into FMTL can simultaneously enable efficient communication and model heterogeneity among clients, whereas existing methods rely on a public dataset, which is impractical in reality. To tackle this dilemma, Federated MultI-task Distillation for Multi-access Edge CompuTing (FedICT) is proposed. FedICT keeps local and global knowledge apart during the bi-directional distillation processes between clients and the server, aiming to enable multi-task clients while alleviating the client drift caused by divergent optimization directions of client-side local models. Specifically, FedICT includes Federated Prior Knowledge Distillation (FPKD) and Local Knowledge Adjustment (LKA). FPKD is proposed to reinforce the clients' fitting of local data by introducing prior knowledge of local data distributions. Moreover, LKA is proposed to correct the distillation loss of the server, making the transferred local knowledge better match the generalized representation. Experiments on three datasets show that FedICT significantly outperforms all compared benchmarks in various data heterogeneity and model architecture settings, achieving improved accuracy with less than 1.2% of the training communication overhead of FedAvg and no more than 75% of the training communication rounds of FedGKT.
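A generic bi-directional distillation step is sketched below for orientation; it is a simplification under assumptions, not FedICT's FPKD or LKA losses: each side treats the other's softened logits as a teacher signal via a temperature-scaled KL term.

```python
# Sketch: bi-directional knowledge distillation between a client model and the server model.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, temperature=3.0):
    """KL(teacher || student) on temperature-softened distributions."""
    t = temperature
    return F.kl_div(
        F.log_softmax(student_logits / t, dim=-1),
        F.softmax(teacher_logits / t, dim=-1),
        reduction="batchmean",
    ) * (t * t)

client_logits = torch.randn(16, 10)   # output of a heterogeneous client-side model
server_logits = torch.randn(16, 10)   # output of the server-side (global) model
loss_client = distillation_loss(client_logits, server_logits.detach())  # client learns from server
loss_server = distillation_loss(server_logits, client_logits.detach())  # server learns from client
print(loss_client.item(), loss_server.item())
```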
Most existing text-video retrieval methods focus on cross-modal matching between the visual content of offline videos and textual query sentences. However, in real scenarios, online videos are frequently accompanied by relevant text information such as titles, tags, and even subtitles, which can be utilized to match textual queries. This inspires us to generate associated captions from offline videos to help with existing text-video retrieval methods. To do so, we propose to use a zero-shot video captioner with knowledge from pre-trained web-scale models (e.g., CLIP and GPT-2) to generate captions for offline videos without any training. Given the captions, one question naturally arises: what can auxiliary captions do for text-video retrieval? In this paper, we present a novel framework, Cap4Video, which makes use of captions in three aspects: i) Input data: The video and captions can form new video-caption pairs as data augmentation for training. ii) Feature interaction: We perform feature interaction between video and caption to yield enhanced video representations. iii) Output score: The Query-Caption matching branch can be complementary to the original Query-Video matching branch for text-video retrieval. We conduct thorough ablation studies to demonstrate the effectiveness of our method. Without any post-processing, our Cap4Video achieves state-of-the-art performance on MSR-VTT (51.4%), VATEX (66.6%), MSVD (51.8%), and DiDeMo (52.0%).
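The output-level use of captions can be sketched as a simple late fusion of the two matching branches; the fusion weight alpha and the helper fuse_scores are assumptions for illustration, not the paper's exact setting.

```python
# Sketch: fusing Query-Video and Query-Caption similarity scores at retrieval time.
import torch

def fuse_scores(query_video_sim, query_caption_sim, alpha=0.8):
    """query_video_sim, query_caption_sim: (num_queries, num_videos) similarity matrices."""
    return alpha * query_video_sim + (1 - alpha) * query_caption_sim

qv = torch.randn(5, 100)   # Query-Video similarities
qc = torch.randn(5, 100)   # Query-Caption similarities
ranking = fuse_scores(qv, qc).argsort(dim=-1, descending=True)
print(ranking[:, :5])      # top-5 retrieved videos per query
```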
While the rollout of the fifth-generation mobile network (5G) is underway across the globe with the intention of delivering 4K/8K UHD videos, Augmented Reality (AR), and Virtual Reality (VR) content to massive numbers of users, coverage and throughput remain among the most significant issues, especially in rural areas, where only low-frequency-band 5G is being deployed. This calls for a high-performance adaptive bitrate (ABR) algorithm that can maximize user quality of experience (QoE) given 5G network characteristics and the data rates of UHD content. Many recently proposed ABR techniques are machine-learning based. Among them, Pensieve is one of the state-of-the-art techniques, using reinforcement learning to generate an ABR algorithm based on observations of past decision performance. By incorporating the context of the 5G network and UHD content, Pensieve has been optimized into Pensieve 5G. New QoE metrics that more accurately represent the QoE of UHD video streaming on different types of devices were proposed and used to evaluate Pensieve 5G against other ABR techniques, including the original Pensieve. The results of simulations based on real 5G Standalone (SA) network throughput show that Pensieve 5G outperforms both conventional algorithms and Pensieve, with average QoE improvements of 8.8% and 14.2%, respectively. Additionally, Pensieve 5G also performed well on the commercial 5G NR-NR Dual Connectivity (NR-DC) network, despite the training being done solely using data from the 5G Standalone (SA) network.
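For reference, ABR evaluations of this kind commonly use a linear QoE of the form sketched below (bitrate utility minus rebuffering and bitrate-switch penalties); the weights here are placeholders, not the UHD- and device-specific metrics proposed in this work.

```python
# Sketch: a common linear QoE formulation for evaluating ABR decisions over a session.
def qoe(bitrates_mbps, rebuffer_s, w_rebuffer=4.3, w_smooth=1.0):
    utility = sum(bitrates_mbps)                       # reward for higher chunk bitrates
    rebuffer_penalty = w_rebuffer * sum(rebuffer_s)    # penalty for stalls (seconds)
    smooth_penalty = w_smooth * sum(                   # penalty for bitrate switches
        abs(b1 - b0) for b0, b1 in zip(bitrates_mbps, bitrates_mbps[1:]))
    return utility - rebuffer_penalty - smooth_penalty

print(qoe([8, 16, 16, 25, 25], [0.0, 0.4, 0.0, 0.0, 0.2]))
```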
The typical approach to relation extraction is to fine-tune large pre-trained language models on task-specific datasets and then select the label with the highest probability in the output distribution as the final prediction. However, the Top-k prediction set for a given sample is commonly overlooked. In this paper, we first reveal that the Top-k prediction set of a given sample contains useful information for predicting the correct label. To effectively utilize the Top-k prediction set, we propose a Label Graph Network with Top-k Prediction Set, termed KLG. Specifically, for a given sample, we build a label graph to review the candidate labels in the Top-k prediction set and learn the connections between them. We also design a dynamic $k$-selection mechanism to learn more powerful and discriminative relation representations. Our experiments show that KLG achieves the best performance on three relation extraction datasets. Moreover, we observe that KLG is more effective in dealing with long-tailed classes.
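A sketch of reviewing the Top-k prediction set is shown below; the fully-connected graph construction and the helper build_topk_label_graph are illustrative assumptions rather than KLG's exact design.

```python
# Sketch: turn a sample's Top-k predicted labels into a small label graph whose
# nodes carry label embeddings, which a downstream graph network could refine.
import torch

def build_topk_label_graph(logits, label_embeddings, k=5):
    probs = torch.softmax(logits, dim=-1)
    topk_probs, topk_labels = probs.topk(k)
    nodes = label_embeddings[topk_labels]                 # (k, dim) node features
    edges = torch.combinations(torch.arange(k), r=2)      # fully connected edge list
    return nodes, edges, topk_labels, topk_probs

num_relations, dim = 40, 128
label_embeddings = torch.randn(num_relations, dim)
logits = torch.randn(num_relations)                       # model output for one sample
nodes, edges, labels, probs = build_topk_label_graph(logits, label_embeddings)
print(nodes.shape, edges.shape, labels.tolist())
```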